Add v1 data #59
forsyth2
left a comment
@golaz This is ready for an initial review, but more work still needs to be done. Please see my questions/comments made as part of this self-review. Thanks!
The three goals of the epic are:
- Centralize v1 data on HPSS archive
  - Almost all v1 data is now on NERSC HPSS under `/home/projects/e3sm/www/WaterCycle/E3SMv1/`
  - At this point, I only have one simulation left, which is the 307 TB "HR v1 1950 control (56-135)" listed on Confluence. Back-of-the-envelope calculations showed a `hsi cp` would take over 2 days, but `hsi mv` would be instantaneous.
    - @ndkeen if you don't need the data to be in `/home/n/ndk/2019/theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG`, can you please move it to `/home/projects/e3sm/www/WaterCycle/E3SMv1/HR/theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG`?
- Add v1 documentation page to e3sm_data_docs
  - This is `docs/source/v1/WaterCycle/simulation_data/simulation_table.rst`, which can be seen rendered at https://portal.nersc.gov/cfs/e3sm/forsyth/data_docs_59/html/v1/WaterCycle/simulation_data/simulation_table.html.
- Update ESGF links for native output
  - What are the correct links/templates/patterns to use?
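For reference, the back-of-the-envelope copy estimate mentioned above can be sketched roughly as follows. The sustained rate passed in is purely an illustrative assumption, not a measured HPSS figure, and the function names are hypothetical.

```python
def copy_hours(size_tb: float, rate_gb_per_s: float) -> float:
    """Hours needed to copy size_tb terabytes at a sustained rate (decimal units)."""
    size_gb = size_tb * 1000.0
    return size_gb / rate_gb_per_s / 3600.0


def implied_rate_gb_per_s(size_tb: float, hours: float) -> float:
    """Sustained rate (GB/s) implied by copying size_tb terabytes in the given hours."""
    return size_tb * 1000.0 / (hours * 3600.0)


# 307 TB at an assumed 1.5 GB/s (hypothetical rate, for illustration only).
estimated_hours = copy_hours(307, 1.5)

# Rate that would be needed to finish the same copy in exactly 2 days.
rate_for_two_days = implied_rate_gb_per_s(307, 48)
```

Put differently, a copy that takes more than 2 days corresponds to a sustained rate somewhat below 1.8 GB/s, whereas a move within HPSS should not need to transfer the data at all.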
docs/source/v1/WaterCycle/index.rst
Outdated
| Experiments:
|
| The datasets include the following experiments:
This covers LR, what about HR?
- For LR, we also need to list the future projection simulations (and paper: https://doi.org/10.5194/gmd-15-3941-2022).
- For HR, the main reference paper should be: https://doi.org/10.1029/2019MS001870
| @@ -0,0 +1,111 @@
| **********************************
This page can be seen in rendered form at https://portal.nersc.gov/cfs/e3sm/forsyth/data_docs_59/html/v1/WaterCycle/simulation_data/simulation_table.html.
| @@ -0,0 +1,52 @@
| model_version, group, resolution, category, simulation_name, machine, checksum, experiment, ensemble_num, link_type, node,
- I recall, for the v2 simulations, each simulation had a Confluence page that listed the values to use for `checksum`, but I don't know if `v1` had that too.
- It was a little easier to deduce the `experiment` & `ensemble_num` for `LR` simulations. What should the `HR` simulation `experiment`s be?
- What should the `link_type` (CMIP only, native, both?) be for these simulations?
I don't think we had checksums. The machine is long gone, so we cannot reproduce these simulations anyway.
utils/simulations_v1_water_cycle.csv
Outdated
| v1, WaterCycle, HR, DECK, 20211021-maint-1.0-tro.A_WCYCLSSP585_CMIP6_HR.ne120_oRRS18v3_ICG.unc12-3rd-attempt, ???, , ssp5_8.5, 1, , ,
| v1, WaterCycle, HR, DECK, 20200517-maint-1.0-tro.A_WCYCL20TRS_CMIP6_HR.ne120_oRRS18v3_ICG.unc11, ???, , , , , ,
| v1, WaterCycle, HR, DECK, 202101027-maint-1.0-tro.A_WCYCL20TRS_CMIP6_HR.ne120_oRRS18v3_ICG.unc12, ???, , , , , ,
| v1, WaterCycle, HR, DECK, theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG, , theta, , , , ,
| v1, WaterCycle, HR, DECK, 20210112.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG.unc06, ???, , , , , ,
For 4 simulations, I couldn't deduce what machine they were run on.
Don't know either. Maybe that's not important.
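If it helps to keep track of which rows still carry the `???` placeholder, a minimal sketch along these lines could list them. It assumes the header and row layout quoted in this thread; the function name and the inline sample text are hypothetical, and this is not the repo's actual table generator.

```python
import csv
import io

# Header and two rows copied from the CSV excerpts in this review thread.
SAMPLE = """\
model_version, group, resolution, category, simulation_name, machine, checksum, experiment, ensemble_num, link_type, node,
v1, WaterCycle, HR, DECK, 20211021-maint-1.0-tro.A_WCYCLSSP585_CMIP6_HR.ne120_oRRS18v3_ICG.unc12-3rd-attempt, ???, , ssp5_8.5, 1, , ,
v1, WaterCycle, HR, DECK, 20210112.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG.unc06, ???, , , , , ,
"""


def rows_with_unknown_machine(csv_text: str, placeholder: str = "???") -> list:
    """Return the simulation names whose machine column holds the placeholder."""
    # skipinitialspace drops the blank that follows each comma in this file.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return [
        row["simulation_name"]
        for row in reader
        if row["machine"].strip() == placeholder
    ]


unknown = rows_with_unknown_machine(SAMPLE)
```

The same function could be pointed at the contents of `utils/simulations_v1_water_cycle.csv` read from disk.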
| Scripts are not available to reproduce v1 simulations.
|
| Original run scripts (the scripts that were originally used to create the simulations) have been archived here `here <https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v1/original/>`_. These latter scripts are provided for reference only.
I'm assuming we should add original scripts, but I can't seem to find them.
Example:
> zstash ls --hpss=/home/projects/e3sm/www/WaterCycle/E3SMv1/HR/cori-knl.20190214_maint-1.0.F2010C5-CMIP6-HR.ARE.nudgeUV.1850aero.ne120_oRRS18v3 *.sh
For help, please see https://e3sm-project.github.io/zstash. Ask questions at https://github.com/E3SM-Project/zstash/discussions/categories/q-a.
build/ice/source/core_atmosphere/physics/checkout_data_files.sh
build/ice/source/core_ocean/get_cvmix.sh
build/ocn/source/core_atmosphere/physics/checkout_data_files.sh
build/ocn/source/core_ocean/get_BGC.sh
build/ocn/source/core_ocean/get_cvmix.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/Bryan-Lewis/BL_test.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/common/build.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/common/check_inputdata.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/common/environ.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/common/parse_inputs.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/common/run.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/common/usage.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/double_diff/double_diff-test.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/kpp/kpp-test.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/shear/shear-test.sh
build/ocn/source/core_ocean/.cvmix_all/reg_tests/tidal-Simmons/Simmons-test.sh
case_scripts/.env_mach_specific.sh
test01/case_scripts/.env_mach_specific.sh
None of these look like a production run script...
Did we list the scripts on Confluence?
I finally found the v2 simulation Confluence pages (e.g., v2.LR.piControl), but I'm not seeing clear equivalents anywhere for v1. In any case, that page links to a run script, and from there it looks like there may be some v1 run scripts.
Run scripts might be in provenance. In index.rst, tell users to search for `case_scripts/run_script_provenance/`; they can use `zstash extract` to get those files. We don't have to worry about adding run scripts to this repo.
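As a rough illustration of that suggestion, the extract command could be assembled like this. The glob pattern is an assumption based on the directory name mentioned above, and whether a given v1 archive actually contains that directory has not been verified; the helper name is hypothetical.

```python
import shlex


def zstash_extract_command(
    hpss_path: str,
    pattern: str = "case_scripts/run_script_provenance/*",  # assumed pattern
) -> list:
    """Build (but do not run) a zstash extract command for provenance scripts."""
    return ["zstash", "extract", f"--hpss={hpss_path}", pattern]


cmd = zstash_extract_command(
    "/home/projects/e3sm/www/WaterCycle/E3SMv1/HR/"
    "cori-knl.20190214_maint-1.0.F2010C5-CMIP6-HR.ARE.nudgeUV.1850aero.ne120_oRRS18v3"
)

# shlex.join quotes the glob so the shell passes it to zstash unexpanded.
print(shlex.join(cmd))
```

Only building the argument list keeps the sketch safe to run anywhere; on a machine with HPSS access the list could be handed to `subprocess.run`.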
forsyth2
left a comment
Most importantly, we need correct ESGF links. Other than that, it would be nice to categorize/describe the simulations a little better.
I added what run scripts I could find, but not all of them appear to be available anywhere.
docs/source/v1/WaterCycle/index.rst
Outdated
| TODO: Find remaining original run scripts
| TODO: Add descriptions for added LR simulations above
| TODO: Correctly categorize HR simulations
| TODO: Determine correct CMIP/Native Links
Remaining TODOs
Also change group to e3sm in /home/projects/e3sm/www/WaterCycle/E3SMv1/. If @ndkeen can move the remaining simulation then we'll also need the e3sm group for that.
Running chgrp -R e3sm /home/projects/e3sm/www/WaterCycle/E3SMv1/
Guys, stop tagging me, pls. :)
Sorry about that! Thanks @rljacob for updating the references to the correct tag!
forsyth2
left a comment
Notes from meeting with @golaz
- LRtunedHR -- LR with HR parameters, leave them under HR. Add "LRtunedHR" group for simulations that contain that substring
- nudgeUV doesn't need further categorization, U (W/E) V (N/S) are wind directions.
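The substring rule from the first note could be sketched as below. This is only one reading of the note (the simulations stay in the HR section but get an "LRtunedHR" label); the function name and the example simulation names are hypothetical.

```python
def resolution_group(simulation_name: str, default_group: str = "HR") -> str:
    """Label a simulation "LRtunedHR" if its name contains that substring."""
    if "LRtunedHR" in simulation_name:
        return "LRtunedHR"
    return default_group


# Hypothetical names, used only to exercise the rule.
tuned = resolution_group("example.LRtunedHR.hypothetical_case")
plain = resolution_group("example.hypothetical_HR_case")
```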
docs/source/v1/WaterCycle/index.rst
Outdated
| * DAMIP
|
| * damip_hist-GHG 3 ensembles
GHG only
docs/source/v1/WaterCycle/index.rst
Outdated
| * ssp5-8.5 5 ensembles
| * damip_ssp5-8.5-GHG 3 ensembles
future projection
future projection with GHG only
"ensemble members" not ensembles
docs/source/v1/WaterCycle/index.rst
Outdated
| * *The DOE E3SM Coupled Model Version 1: Overview and Evaluation at Standard Resolution* `doi: 10.1029/2018MS001603 <https://doi.org/10.1029/2018MS001603>_`
| * *Description of historical and future projection simulations by the global coupled E3SMv1.0 model as used in CMIP6* `doi:10.5194/gmd-15-3941-2022 <https://doi.org/10.5194/gmd-15-3941-2022>_`
| * *The DOE E3SM Coupled Model Version 1: Description and Results at High Resolution* `doi:10.1029/2019MS001870 <https://doi.org/10.1029/2019MS001870>`_
Get links to show up correctly
| @@ -37,22 +37,25 @@ def get_data_size_and_hpss(hpss_path: str) -> Tuple[str, str]:
| hpss = ""
| return (data_size, hpss)
|
| - def get_esgf(source_id: str, model_version: str, experiment: str, ensemble_num: str, cmip_only: str, node: str) -> str:
| + def get_esgf(source_id: str, model_version: str, experiment: str, ensemble_num: str, link_type: str, node: str) -> str:
Sasha might know about what the native links should be. Tony may know as well.
We may not be able to find the links.
forsyth2
left a comment
Merging this. Notes below (and in comments linked to this review).
Preview is at https://portal.nersc.gov/cfs/e3sm/forsyth/data_docs_59/html/v1/WaterCycle/index.html. Once merged, the updates will be visible at https://docs.e3sm.org/e3sm_data_docs/_build/html/v1/WaterCycle/index.html.
This PR resolves this epic:
- Centralize v1 data on HPSS archive
  - That is, moving data from modelers' `home` directories to `/home/projects/e3sm/www/WaterCycle/E3SMv1/`.
  - This is complete except for `theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG` (see comment).
  - @TonyB9000 This is specifically about the "Zstash-archived simulation output", not the "Formerly-published E3SM datasets previously accessible via ESGF links to /p/user_pub" you mentioned in an email.
- Add v1 documentation page to e3sm_data_docs
  - These are the `index.rst` pages added in this PR, describing the simulations.
- Update ESGF links for native output
  - This is being out-of-scoped from this PR, since the v1 data is no longer directly available on ESGF.
  - @TonyB9000 for reference, the idea would have been to have ESGF links similar to those on v2, where we have both CMIP & Native links:
    https://esgf-node.llnl.gov/search/cmip6/?source_id=E3SM-2-0&experiment_id=piControl&variant_label=r1i1p1f1
    https://esgf-node.llnl.gov/search/e3sm/?model_version=2_0&experiment=piControl&ensemble_member=ens1

So, remaining action items after merging this PR:
- Move `theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG` to centralized location.
- Determine some way to show ESGF links for v1 data.
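For context, the two v2-style links quoted above follow a pattern that could be reproduced with a sketch like the one below. It is not the repo's `get_esgf`; the function names are hypothetical, and the mapping from an ensemble number to `r<N>i1p1f1` and `ens<N>` is an assumption inferred from the two example URLs only.

```python
from urllib.parse import urlencode

ESGF_BASE = "https://esgf-node.llnl.gov/search"


def cmip_link(source_id: str, experiment: str, ensemble_num: int) -> str:
    """CMIP6 search link in the style of the v2 example."""
    query = urlencode(
        {
            "source_id": source_id,
            "experiment_id": experiment,
            # Assumed mapping: ensemble member N -> variant label rNi1p1f1.
            "variant_label": f"r{ensemble_num}i1p1f1",
        }
    )
    return f"{ESGF_BASE}/cmip6/?{query}"


def native_link(model_version: str, experiment: str, ensemble_num: int) -> str:
    """Native-output search link in the style of the v2 example."""
    query = urlencode(
        {
            "model_version": model_version,
            "experiment": experiment,
            # Assumed mapping: ensemble member N -> ensN.
            "ensemble_member": f"ens{ensemble_num}",
        }
    )
    return f"{ESGF_BASE}/e3sm/?{query}"


cmip_url = cmip_link("E3SM-2-0", "piControl", 1)
native_url = native_link("2_0", "piControl", 1)
```

Whether equivalent v1 links can be built this way depends on the v1 data being searchable on ESGF again, which is the open action item.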
| E3SMv1 (Water Cycle)
| ====================
|
| The `E3SM version 1 water cycle simulation campaign <https://e3sm.org/research/water-cycle/v1-water-cycle/>`_ includes standard set of
"website is undergoing maintenance"
| with ocean and sea ice grid of 60 km in the mid-latitudes and 30 km at the equator and poles,
| and river transport at 55 km horizontal resolution.
| This model configuration is described in
| `“v1 1 deg CMIP” <https://e3sm.org/model/scientifically-validated-configurations/v1-configurations/v1-1-deg-cmip6/?preview=true>`_ page
"website is undergoing maintenance"
| and river transport at 55 km horizontal resolution.
| This model configuration is described in
| `“v1 1 deg CMIP” <https://e3sm.org/model/scientifically-validated-configurations/v1-configurations/v1-1-deg-cmip6/?preview=true>`_ page
| in `Scientifically Validated Configurations <https://e3sm.org/model/scientifically-validated-configurations/>`_.
"website is undergoing maintenance"
| in `Scientifically Validated Configurations <https://e3sm.org/model/scientifically-validated-configurations/>`_.
|
| For more details,
| refer to `Coupled E3SM v1 Model Overview <https://e3sm.org/?p=5470>`_ or to the reference papers:
"website is undergoing maintenance"
| Scripts are not available to reproduce v1 simulations.
|
| Some original run scripts (the scripts that were originally used to create the simulations) have been archived here `here <https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v1/original/>`_. If a script is not collected here, you can try looking for the provenance run script with:
This link will work once this PR is merged.
| v1, WaterCycle, LR, Projection, 20191019.DECKv1b_P3_SSP5-8.5-GHG.ne30_oEC.cori-knl, cori-knl, , damip_ssp5-8.5-GHG, 3, none, ,
| v1, WaterCycle, HR, Control Runs, theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG, theta, , 1950S, 1, none, ,
| v1, WaterCycle, HR, Control Runs, theta.20190910.branch_noCNT.n438b.unc03.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG, theta, , 1950S, 2, none, ,
| v1, WaterCycle, HR, Control Runs, theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG, theta, , 1950S, 3, none, ,
This still needs to be moved to the centralized location:
hsi mv /home/n/ndk/2019/theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG /home/projects/e3sm/www/WaterCycle/E3SMv1/HR/theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG
`mv` must be used because `cp`, by back-of-the-envelope calculation, would take over 50 hours. `mv` can only be done by the file owner.
@ndkeen I don't actually need a full review here, I just need you to do the above. Thanks!
@forsyth2 This looks great Ryan. Re-establishing ESGF access to the "formerly published" v1 sets is something we might engineer - depending upon demand and priorities. Had I known there was to be an interest in maintaining ESGF access, I think we might have engineered a different archive format (maybe, "per-dataset" tar-files, etc).
@forsyth2 thanks for working on this. For the ESGF links, what we should include is the CMIP-formatted v1 simulations. For example, 20180129.DECKv1b_piControl.ne30_oEC.edison has this ESGF URL: https://esgf-node.ornl.gov/search?project=CMIP6&activeFacets=%7B%22institution_id%22%3A%22E3SM-Project%22%2C%22source_id%22%3A%22E3SM-1-0%22%2C%22experiment_id%22%3A%22piControl%22%7D Also, I think we should add the v1 Large Ensemble as well. Both can be added in a new PR.
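The `activeFacets` value in that URL appears to be a percent-encoded compact JSON object, so links of this form could be generated with something like the sketch below. The function name is hypothetical, and treating the facet set as fixed to these three keys is an assumption taken from the single example above.

```python
import json
from urllib.parse import quote


def esgf_cmip6_search_url(
    facets: dict,
    base: str = "https://esgf-node.ornl.gov/search",
) -> str:
    """Build a CMIP6 search URL whose activeFacets is percent-encoded JSON."""
    # Compact separators avoid spaces; safe="" also encodes ':' and ','.
    compact_json = json.dumps(facets, separators=(",", ":"))
    return f"{base}?project=CMIP6&activeFacets={quote(compact_json, safe='')}"


pi_control_url = esgf_cmip6_search_url(
    {
        "institution_id": "E3SM-Project",
        "source_id": "E3SM-1-0",
        "experiment_id": "piControl",
    }
)
```

Swapping `experiment_id` for another experiment name would give the corresponding link, assuming the v1 CMIP data for that experiment is published under the same `source_id`.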
I thought we weren't planning to add them, which is why I didn't include them initially. But yes, we can certainly add those in as well. I sent a support ticket into NERSC with you cc'd for guidance on copying the data. (Let me know if we're happy with the
Replacement of #46, to resolve #43.